ABSTRACT (continued):


are  typically  invoked  in  the  interpretation  of  simple  line  drawings  and  (2)  to  show 
that  the  “lattice”  framework  specifies  all  of  these  interpretations,  placing  them  in 
proper  rank  order. 

In  parallel,  we  are  also  exploring  two  other  models  for  merging  bottom-up  and 
top-down information, both of which are neural-based. One, called sequence-seeking
(Ullman), proposes a network hierarchy where a sequence of transformations of
both  the  input  data  and  the  target  model  occur  in  parallel,  searching  for  the  proper 
mapping  that  brings  each  into  register.  The  proposal  makes  a  special  effort  to 
incorporate  what  we  currently  know  about  cortical  machinery,  and  also  has  triggered 
psychophysical  experiments.  (We  have  not  yet  explored  the  relations  between  the 
lattice  and  sequence-seeking  proposals.) 

Finally,  there  are  some  studies  related  to  our  ability  to  switch  between  sets 
of  premises,  or  to  alter  our  models.  These  assume  “neural-like”  states  that  must 
interplay  with  one  another.  At  the  moment,  we  can  show  that  this  process  is  not 
Poisson,  as  proposed  for  very  low-level  multi-stable  percepts,  but  is  more  likely 
characterized  as  a  non-linear,  dynamical  system. 
As in the previous year, our research effort breaks down roughly into three
categories:

1. Theoretical Studies: A Formal Framework for Percepts

A  Neural  Proposal  for  Recognition 

2.  Experiments  related  to  the  above 

3.  Studies  of  Dynamical  Systems  Behavior  (Chaos  in  Percepts) 


1.  Theoretical  Studies 

Here  we  have  two  main  thrusts,  one  concerned  with  the  logical,  formal  structure 
that underlies the act of perception (Richards, Jepson & Feldman) and the other
a  proposal  for  how  neural  machinery  might  match  the  incoming  sense  data  to  an 
internal  model  (Ullman). 

1.1  Logic  in  Percepts  (Richards  &  Jepson) 

This  work  began  about  two  years  ago,  when  we  realized  that  although  many  are 
studying “Perception”, there is no formal definition of just what a percept is. Without
such a definition, how can we decide whether a particular machine or biological
state  (or  model  output)  qualifies  as  a  perception?  Furthermore,  how  can  we  build  a 
true  theory  of  a  percept  without  a  clear  specification  of  the  kinds  of  state  variables, 
operations,  and  “language”  that  are  entailed? 

Our  first  answer  to  “What  Is  a  Percept?”  was  to  note  that  perceptions  are 
inductive  inferences.  When  conclusions  about  a  state  in  the  world  are  drawn  from 
the sense data, then (fallible) premises must be proposed to complete the inference
process.  Because  these  premises  are  fallible  -  they  are  simply  intelligent  guesses  - 
a  partial  order  can  be  placed  upon  possible  interpretations  of  the  sense  data,  given 
the  chosen  premises.  The  order  is  determined  by  ranking  the  premise  combination 
that must be “given up”. Within such an order, a percept can then be defined as a
maximal  node.  (This  is  not  equivalent  to  minimizing  the  faulted  premises.)  The  key 
to  locating  these  maximal  nodes  is  to  be  able  to  reason  about  the  consistency  of  the 
data, given the current state of “top-down” knowledge (Jepson & Richards, 1991).
In a forthcoming paper, “Lattice Framework for Integrating Vision Modules”, we
compare a specialized version of our proposal to several others, such as probabilistic
reasoning  and  Hough  transform  schemes  that  are  often  used  to  resolve  conflicting 
conclusions  reached  by  different  sense  modules.  (A  simple  example  of  such  a  conflict 
would  be  when  you  view  the  TV  screen;  motion  information  implies  the  scene  is 
three  dimensional,  but  your  binocular  system  claims  the  scene  is  flat.) 
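
As a rough illustration of the “maximal node” idea, the sketch below orders candidate interpretations by the set of premises each must give up, using strict set inclusion as the partial order. The premise labels (P1, P2, P3), the candidate interpretations (A, B, C), and the choice of set inclusion are all assumptions made for illustration; the ranking of premise combinations in the actual framework may differ.

```python
from typing import Dict, FrozenSet, List


def maximal_interpretations(given_up: Dict[str, FrozenSet[str]]) -> List[str]:
    """Return the interpretations that are not dominated under set inclusion.

    `given_up` maps each (hypothetical) interpretation to the premises it
    must abandon. One interpretation dominates another when it abandons a
    strict subset of the premises the other abandons.
    """
    maximal = []
    for name, dropped in given_up.items():
        # `other < dropped` is the strict-subset test for frozensets.
        dominated = any(other < dropped for other in given_up.values())
        if not dominated:
            maximal.append(name)
    return sorted(maximal)


# Hypothetical premises and interpretations, for illustration only.
candidates = {
    "A": frozenset({"P1"}),
    "B": frozenset({"P2", "P3"}),
    "C": frozenset({"P1", "P2"}),
}

percepts = maximal_interpretations(candidates)
print(percepts)
```

Under these assumptions B is a maximal node even though it gives up two premises while A gives up only one, which is one way to see why a maximal node need not minimize the number of faulted premises.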

More  recently,  we  have  made  some  major  revisions  in  our  “Lattice  Theory  for 
Percepts”,  showing  more  clearly  the  structure  of  the  reasoning  process  involved, 
as  well  as  a  further  clarification  of  the  components  needed  to  support  a  formal 
theory  of  perception.  Our  aim  here  is  to  show  how  the  perceptual  interpretation 
process  can  be  cast  in  terms  of  first-order  logic.  Thus,  in  Richard  Gregory’s  or  Irwin 
Rock’s  terms,  we  actually  put  the  logic  into  percepts,  allowing  us  eventually  to  run 
a program that reasons about the sense data (a picture).


1.2 A Neural Proposal (Ullman)


At a completely different level, Ullman has proposed a network hierarchy scheme
for how “bottom-up” information comes into register with “top-down” models. The
basic process, termed “sequence-seeking”, is a search for a sequence of mappings or
transformations linking a source and target representation. The search is bidirectional
throughout the hierarchy - “bottom-up” as well as “top-down”. The novel
part of the proposal is that the two searches are performed along two separate,
complementary pathways, one ascending, the other descending. When a matching
pattern is found, regardless of the level, then a chain of activity linking the source
and target is generated, facilitating one particular path in the network. The proposal
is largely consistent with what is known about cortical machinery, specifically
the interplay between the various visual areas, and hence is a hypothesis about the
basic scheme of information processing in the neocortex (and thalamus). Experiments
related to this proposal are currently underway - see below.
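
The bidirectional search can be pictured with an ordinary graph-search analogy. The sketch below is not the neural model itself: it assumes representations are nodes of a small graph, transformations are invertible (undirected) edges, and the node names are hypothetical. One search expands from the input, the other from the stored model, and the chain linking them is returned as soon as the two meet.

```python
from collections import deque
from typing import Dict, List, Optional

Graph = Dict[str, List[str]]


def _join(up: dict, down: dict, meet: str) -> List[str]:
    """Stitch the two half-paths together at the meeting node."""
    chain, node = [], meet
    while node is not None:          # walk back to the source
        chain.append(node)
        node = up[node]
    chain.reverse()
    node = down[meet]
    while node is not None:          # walk forward to the target
        chain.append(node)
        node = down[node]
    return chain


def bidirectional_chain(graph: Graph, source: str, target: str) -> Optional[List[str]]:
    """Search from both ends; return a chain of nodes linking source to target."""
    if source == target:
        return [source]
    up = {source: None}      # parents found by the "bottom-up" search
    down = {target: None}    # parents found by the "top-down" search
    frontiers = (deque([source]), deque([target]))
    parents = (up, down)
    while frontiers[0] and frontiers[1]:
        for side in (0, 1):              # alternate between the two searches
            mine, other = parents[side], parents[1 - side]
            for _ in range(len(frontiers[side])):
                node = frontiers[side].popleft()
                for nxt in graph.get(node, []):
                    if nxt in mine:
                        continue
                    mine[nxt] = node
                    if nxt in other:     # the two searches have met
                        return _join(up, down, nxt)
                    frontiers[side].append(nxt)
    return None


# Hypothetical transformation graph, for illustration only.
graph = {
    "image": ["scaled", "rotated"],
    "scaled": ["image", "scaled+rotated"],
    "rotated": ["image", "scaled+rotated"],
    "scaled+rotated": ["scaled", "rotated", "model view"],
    "model view": ["scaled+rotated", "model"],
    "model": ["model view"],
}

chain = bidirectional_chain(graph, "image", "model")
print(chain)
```

Searching from both ends means each search only has to reach the meeting point, so far fewer intermediate representations are typically explored than in a one-directional search of the same depth.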
1.3 “Good” Features and Categories (Richards, Jepson & Feldman)

Certain  image  properties,  such  as  collinearity,  parallel  lines,  V-junctions,  allow 
strong inferences about the 3D configuration that creates these image features (Binford,
Lowe). Other image projections, such as “X”, do not generally support strong
inferences, either because they occur generically in the image or because they can
arise in many different ways, as a T-junction can. We now have completed a specification
for just what constitutes a “good” feature - i.e. one with strong inferential
powers.  In  this  specification,  we  show  how  a  measure  can  be  placed  on  the  inductive 
power  of  a  feature,  namely  the  codimension  of  the  arrangement  in  the  chosen  model 
class (see Richards & Jepson, 1992). This work is important for two reasons: (1)
it  tells  us  which  image  features  are  worth  computing  and  (2)  these  features  then, 
in  turn,  dictate  just  how  a  “feature  space”  should  be  subdivided  to  form  useful 
categories within that space. (See Richards & Jepson, 1992, for “Feature” analysis,
and Feldman, 1992, for preliminary work on “Categories as Subspaces”.) Curiously,
it turns out that ideas from catastrophe theory (Poston & Stewart) and solid shape
(Koenderink)  provide  some  of  the  formal  structure  for  both  the  feature  and  category 
work. This is because a feature space can be considered to be a surface in a particular
manifold, and hence the problem reduces in part to when such surfaces should
be  distinguished  and  how  coordinate  frames  should  be  assigned  to  them  (such  as 
choosing  principal  curvatures).  [Note:  a  longer-term  project  related  to  categories 
and  conceptual  theories  -  such  as  in  physics  -  is  underway  with  J.J.  Koenderink.] 


2.0  Experiments 

These  will  be  subdivided  roughly  into  sections  similar  to  the  above. 


2.1  Lattice  Theory  for  Percepts  (Richards  &  Jepson) 

Our principal experimental result over the past year is a study of a rigid 3D configuration
that appears non-rigid: the rotating Ames Trapezoid window plus stick.
When seen in kinetic depth, such as on a TV screen, the motion of the configuration
satisfies all theories (such as Ullman’s, Hoffman & Bennett, Huang, etc.) that
predict rigid motion (indeed, the motion is simulated to arise from a rigid object).
Yet everyone sees the bar moving non-rigidly with respect to the window. Our
“Lattice” explanation for this percept is given in the attached SMC paper: “A Lattice
Framework for Integrating Vision Modules”. As proposed above in Section 1,
our answer is simply that, given very reasonable premises about structures in the
world, a non-rigid interpretation will be preferred. (A key to the explanation is that
part-decompositions are a necessary step in image understanding, and once we have
committed to “parts”, the space of plausible solutions becomes very restricted.)



RICHARDS 


ANNUAL  REPORT  1992 


We have also completed an analysis of another deceptively complex configuration
- a simple triangle with a stick crossing one of its sides. The stick typically
appears  to  be  oriented  in  space  like  a  handle,  with  its  end-point  lying  in  the  plane 
of  the  triangle.  Why  should  the  stick  appear  to  have  the  angle  it  does,  and  why 
does  the  endpoint  touch  the  triangle  rather  than  free-float  in  space?  Again,  very 
simple  premises  generate  a  lattice  that  answers  these  questions  and  accounts  for 
our  percepts.  (We  are  in  the  process  of  writing  up  this  study.) 

In  addition  to  the  above,  we  also  have  another  handful  of  simple  displays  under 
analysis.  The  aim  here  is  to  see  if  the  same  collection  of  premises  can  be  used  to 
explain  the  percepts  seen  for  quite  different  displays.  Our  current  emphasis  is  on 
premises  dealing  with  how  objects  are  supported. 


2.2  Neural  Mechanisms  for  Recognition 

In contrast to the previous studies that address cognitive issues, the following
experiments are aimed at understanding neural machinery.


2.2.1  Configuration  Stereopsis  (Richards) 

We are just winding up a study on 3D shape that relates to how “top-down” information
about fixation distance (or shape) modulates angular disparity. Because
binocular  disparity  appears  to  be  computed  in  V2,  this  modulation  must  occur  early 
in  the  visual  pathway  and  hence  is  potentially  accessible  to  psychophysical  probing. 

As  the  distance  to  an  object  increases,  the  angular  disparity  needed  to  measure 
the  actual  3D  configuration  must  decrease  (reaching  zero  at  the  horizon).  However, 
if  we  take  an  object,  say  a  cup,  and  evaluate  its  3D  shape  nearby  versus  far  away, 
the  cup  does  not  appear  to  flatten,  although  the  disparity  signal  becomes  much 
smaller  as  the  distance  increases.  This  suggests  a  rescaling  of  disparity  with  object 
(or  fixation)  distance. 
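
The size of the problem can be seen from the standard small-angle geometry of stereopsis, in which the relative disparity of a depth interval dz viewed at distance D with interocular separation I is approximately I * dz / D^2. The sketch below evaluates this for a cup-sized depth interval; the 6.5 cm interocular separation, the 5 cm depth interval, and the two viewing distances are assumed values chosen only for illustration.

```python
import math


def disparity_arcmin(depth_interval_m: float, distance_m: float,
                     interocular_m: float = 0.065) -> float:
    """Small-angle relative disparity, in arc minutes, of a depth interval
    centred at a given viewing distance: delta ~= I * dz / D**2."""
    radians = interocular_m * depth_interval_m / distance_m ** 2
    return math.degrees(radians) * 60.0


# Same (assumed) 5 cm depth interval viewed at 0.5 m and at 2 m.
near = disparity_arcmin(0.05, 0.5)
far = disparity_arcmin(0.05, 2.0)
print(f"near: {near:.1f} arcmin, far: {far:.1f} arcmin, ratio: {near / far:.0f}")
```

Because disparity falls with the square of distance, moving the object four times farther away cuts the disparity signal by a factor of sixteen, so a fixed mapping from disparity to depth would make the far cup look much flatter.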

We  have  conducted  parametric  studies  of  3D  shape  from  stereo  over  a  wide 
range of fixation distances. The data show that indeed, the depth measure associated
with a fixed angular disparity changes with fixation distance. The effect
is in the direction needed to preserve the shape of 3D configurations as their distance
changes, and is roughly two-thirds of what is needed for a full correction.
This  is  evidence  for  neural  signals  being  modified  at  or  before  the  extraction  of 
binocular  disparity.  Hence  we  have  a  preliminary  “handle”  on  how  a  simple  case  of 
“bottom-up”  information  -  namely  binocular  disparity  -  may  incorporate  a  form  of 
“top-down”  knowledge. 
2.2.2 Texture Curvature (with Hugh Wilson)

This  study  examines  curvature  discrimination  for  edges  created  by  texture  contours, 
and  includes  a  model  incorporating  end-stopped  complex  cells.  (The  manuscript  is 
under review by JOSA A (copy enclosed).)


2.3 Computational Vision

We  have  two  studies  now  underway  in  this  area,  one  on  shape-from-shading  and 
stereo,  the  other  just  beginning  on  color  texture. 

2.3.1  Shading  and  Stereo  (Dawson  &  Shashua) 

Pseudo stereopsis occurs when the binocular disparities of a surface, such as a face, are
reversed  but  the  shading  is  not.  The  impression  is  that  the  face  is  “normal”  -  the 
nose,  for  example,  still  points  outward  to  the  viewer. 

We  have  manipulated  noses  using  graphics  techniques  in  order  to  push  them 
inward,  “into  the  head”  so  to  speak,  without  altering  the  shading.  No  one  is  able  to 
see  these  noses  “shoved  in”.  Our  analysis  suggests  that  this  failure  of  stereopsis  is 
simply  due  to  the  shape-from-shading  solution  “overriding”  (in  the  Percepts  Lattice 
sense)  the  weak  stereo  signal  created  by  shaded  rather  than  sharp  contours.  The 
effect is not special to faces, and occurs also for “Play-Doh” shapes.

2.3.2 Color Texture (with D.D. Hoffman et al. at Irvine)

Although much work has been devoted to understanding the appearance of homogeneous
color patches, almost nothing is known about how we represent colored
textures.  Our  approach  is  to  consider  the  spatial  texture  pattern  as  generated  by 
a  Markovian  process,  which  “paints”  different  colors  on  a  surface.  The  problem, 
then,  is  to  recover  the  characteristic  parameters  of  this  underlying  process. 
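
To make the recovery problem concrete, the sketch below uses the simplest stand-in for such a process: a first-order Markov chain that paints color labels along a single row. The transition probabilities, the three-color alphabet, and the one-dimensional layout are assumptions for illustration only; the process on an actual surface would be two-dimensional and is not specified here. The parameters are recovered by counting adjacent label pairs, which is the maximum-likelihood estimate for this simple chain.

```python
import numpy as np


def sample_texture(P: np.ndarray, length: int, rng: np.random.Generator) -> np.ndarray:
    """Draw a row of color labels from a first-order Markov chain."""
    n = P.shape[0]
    row = np.empty(length, dtype=int)
    row[0] = rng.integers(n)
    for i in range(1, length):
        # The next color depends only on the current color.
        row[i] = rng.choice(n, p=P[row[i - 1]])
    return row


def estimate_transitions(row: np.ndarray, n: int) -> np.ndarray:
    """Estimate the transition matrix by counting adjacent label pairs."""
    counts = np.zeros((n, n))
    np.add.at(counts, (row[:-1], row[1:]), 1)
    totals = counts.sum(axis=1, keepdims=True)
    return counts / np.maximum(totals, 1)


# Hypothetical 3-color process (each row sums to 1).
P_true = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.3, 0.3, 0.4]])

rng = np.random.default_rng(0)
row = sample_texture(P_true, 20000, rng)
P_est = estimate_transitions(row, 3)
print(np.round(P_est, 2))
```

With enough samples the estimated matrix approaches the generating one, which is the sense in which the characteristic parameters of the painting process can be read back from the texture.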

This  problem  is  almost  ideally  suited  to  the  formalism  described  in  Observer 
Mechanics (Bennett, Hoffman & Prakash), because Markovian kernels lie at the
heart of this theory. On the experimental side, we know from earlier work on “Texture
Matching” that there will be severe psychophysical restrictions on discriminable
patterns, just as in color matching, and expect to find further constraints imposed
upon  color-texture  matches.  (Julesz  studied  this  briefly  many  years  ago.) 

To  date,  we  have  met  for  three  days  on  this  problem  at  Irvine. 
3.0 Dynamical Systems (Chaos in Percepts)

This  project  continues  to  advance,  although  slowly.  The  underlying  hypothesis  is 
that  switching  between  multi-stable  percepts,  or  switching  between  premises,  must 
be  rapid.  A  chaotic-like,  dynamical  system  would  be  an  attractive  mechanism. 

Finally  (!)  we  have  been  able  to  prove  experimentally  that  very  high  level 
perceptual  processing  is  chaotic,  with  a  dimension  roughly  3.  Our  difficulty  has 
been  in  getting  data  that  is  sufficiently  free  from  noise.  We  have  such  data  now  and 
the  result  is  clear. 
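
The method used to estimate the dimension is not described here. One common estimator for such data is the correlation dimension (Grassberger-Procaccia): the slope of log C(r) against log r, where C(r) is the fraction of point pairs closer than r. The sketch below is offered only as an assumed illustration of that estimator, checked on a point set of known dimension (a circle, dimension 1); for a measured time series one would first build delay vectors, a step omitted here. The radii and the test set are arbitrary choices.

```python
import numpy as np


def correlation_dimension(points: np.ndarray, radii: np.ndarray) -> float:
    """Slope of log C(r) against log r, where C(r) is the fraction of
    point pairs closer than r (the correlation sum)."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Keep each unordered pair once (upper triangle, no diagonal).
    pair_d = dists[np.triu_indices(len(points), k=1)]
    c = np.array([(pair_d < r).mean() for r in radii])
    slope, _ = np.polyfit(np.log(radii), np.log(c), 1)
    return float(slope)


# Sanity check on a set of known dimension: points on a unit circle.
theta = np.linspace(0.0, 2.0 * np.pi, 1500, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
radii = np.geomspace(0.05, 0.4, 8)

d_est = correlation_dimension(circle, radii)
print(f"estimated dimension: {d_est:.2f}")
```

Noise inflates such estimates at small radii, which is one reason that sufficiently clean data matter for a dimension estimate of this kind.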

Our  next  step  is  to  develop  a  model  for  the  effects  of  noise  on  a  non-linear 
time  series  process  (with  Jepson)  and  to  show  that  this  model  applies  to  our  earlier, 
noisy  data.  We  then  will  be  able  to  extract  the  underlying  noise-free  dynamical 
process  and  can  proceed  to  model  this  process. 


4.0  Publications  (to  date) 
In Preparation:

Logic  in  percepts  (with  A.  Jepson). 

From features to categories (with A. Jepson & J. Feldman).

Configuration  stereopsis  (W.  Richards). 

Choosing a coordinate frame (with J. Brian Subirana-Vilanova). (See “Figure-ground
in visual perception”, ARVO 1991, for brief presentation.)

Shading and stereo (with B. Dawson & A. Shashua).

Talks: 

University of Minnesota (May 1989) “Perception and perceivers”.

Harvard  University  (Nov.  1989)  “What’s  a  perception?” 

Yale  University  (May  1990)  “What’s  a  percept?” 

University  of  Michigan  (June  1990)  “What’s  a  percept?” 

Cognitive Science Society (July 1990) “Perception, computation and categorization”.

Cornell  University  (June  1991)  “What  makes  a  good  feature?” 

York  University  (June  1991)  “Integrating  vision  modules”. 

University of Illinois (Oct. 1991) (1) “What’s a percept?”, (2) “Choosing coordinate
frames”.

5.0  Funds  and  Personnel 

At  the  end  of  the  second  fiscal  year  (Oct.  1991)  we  were  over  budget  by  $2,600.  We 
expect to recover this overrun in 1992, due to the absence of Shimon Ullman, who
is at Weizmann, but who will return for two months this summer (June and July
1992). We also expect to support in part Jacob Feldman over part of this year.